January 17, 2018

Machine Learning vs Programming

Machine Learning vs Statistics

linear \(\Rightarrow\) non-linear

additive \(\Rightarrow\) interactions

theory-driven \(\Rightarrow\) optimization-driven

Black Box Problem

A client wants you to predict data scientist salaries with machine learning.

Let’s predict data scientist salaries

What is Machine Learning?

Step 1: Find some data

Kaggle conducted an industry-wide survey of data scientists. https://www.kaggle.com/kaggle/kaggle-survey-2017

Information asked:

  • Compensation
  • Demographics
  • Job title
  • Experience
  • …

Contains information from Kaggle ML and Data Science Survey, 2017, which is made available here under the Open Database License (ODbL).

Step 2: Throw ML at your data

Step 3: Profit

Client: “There is a problem with the model!”

“What problem?”

Client: “The older the candidates, the higher the predicted salaries.”

Looking inside the black box

How do the features influence my predictions?

Partial Dependence Plot

Goldstein, A., Kapelner, A., Bleich, J., & Pitkin, E. (2015). Peeking Inside the Black Box: Visualizing Statistical Learning with Plots of Individual Conditional Expectation. Journal of Computational and Graphical Statistics, 24(1), 44–65. https://doi.org/10.1080/10618600.2014.907095

Friedman, J. H. (2001). Greedy Function Approximation: A Gradient Boosting Machine. The Annals of Statistics, 29(5), 1189–1232. https://doi.org/10.2307/2699986
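A minimal sketch of how a partial dependence curve can be computed, using only the Python standard library. The model `toy_salary_model`, the feature names, and the four data rows are hypothetical stand-ins and are not the model or the survey data from this talk. For each grid value, the feature is overwritten in every row and the predictions are averaged.

```python
from statistics import mean


def partial_dependence(predict, rows, feature, grid):
    """Average prediction when `feature` is forced to each grid value."""
    curve = []
    for value in grid:
        # Overwrite the feature in every row; all other features stay as observed.
        modified = [{**row, feature: value} for row in rows]
        curve.append((value, mean([predict(r) for r in modified])))
    return curve


# Hypothetical stand-in for a fitted black-box model (an assumption for illustration).
def toy_salary_model(row):
    base = 30000 + 800 * row["age"]
    bonus = 5000 if row["experience"] > 5 else 0
    return base + bonus


# Hypothetical data rows.
rows = [
    {"age": 25, "experience": 2},
    {"age": 35, "experience": 8},
    {"age": 45, "experience": 20},
    {"age": 55, "experience": 4},
]

pdp_age = partial_dependence(toy_salary_model, rows, "age", grid=[25, 35, 45, 55])
for age, avg_prediction in pdp_age:
    print(age, avg_prediction)
```

The same function works for a categorical feature such as gender: pass the category labels as the grid and read off one average prediction per category.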

Client: “We want to understand the model better!”

What are the most important features?

Permutation feature importance

Breiman, L. (2001). Random Forests. Machine Learning, 45(1), 5–32.
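A rough sketch of the idea behind permutation feature importance: shuffle one feature's column, which breaks its link to the target, and measure how much the prediction error grows. This is a model-agnostic variant written for illustration, not Breiman's out-of-bag procedure; `toy_model`, the feature names, and the data are hypothetical.

```python
import random


def mean_abs_error(predict, rows, targets):
    return sum(abs(predict(r) - y) for r, y in zip(rows, targets)) / len(rows)


def permutation_importance(predict, rows, targets, feature, n_repeats=10, seed=0):
    """Mean increase in error after shuffling the values of `feature`."""
    rng = random.Random(seed)
    baseline = mean_abs_error(predict, rows, targets)
    increases = []
    for _ in range(n_repeats):
        shuffled = [row[feature] for row in rows]
        rng.shuffle(shuffled)
        # Rebuild the rows with the shuffled column; other features are untouched.
        permuted = [{**row, feature: v} for row, v in zip(rows, shuffled)]
        increases.append(mean_abs_error(predict, permuted, targets) - baseline)
    return sum(increases) / n_repeats


# Hypothetical black box that uses "age" and ignores "noise".
def toy_model(row):
    return 30000 + 800 * row["age"]


rows = [{"age": a, "noise": n} for a, n in [(25, 3), (35, 1), (45, 4), (55, 2), (30, 5)]]
targets = [toy_model(r) for r in rows]  # a perfect fit, so the baseline error is 0

importance = {
    f: permutation_importance(toy_model, rows, targets, f) for f in ("age", "noise")
}
print(importance)
```

A feature the model never uses gets an importance of zero, because shuffling it cannot change any prediction.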

Gender?!

What’s the influence of gender on the prediction?

Gender                                               Predicted salary (.y.hat)
Male                                                 50564.82
Female                                               47168.67
Non-binary, genderqueer, or gender non-conforming    49904.63
A different identity                                 49589.76

Explaining individual predictions

Local Models (LIME)

Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. Retrieved from http://arxiv.org/abs/1602.04938
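A simplified sketch of the local-surrogate idea behind LIME, assuming numeric features only: perturb the instance, weight each perturbed sample by its proximity to the instance, and fit a weighted linear model to the black-box predictions. The real method adds further steps (for example, an interpretable feature representation and feature selection). The function `black_box`, the instance, and the scales are hypothetical.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def local_surrogate(predict, instance, scales, n_samples=500, kernel_width=1.0, seed=0):
    """Fit a proximity-weighted linear model around `instance`.

    Returns (intercept, slopes); slopes are per one `scale` unit of each feature.
    """
    rng = random.Random(seed)
    k = len(instance)
    xtwx = [[0.0] * (k + 1) for _ in range(k + 1)]
    xtwy = [0.0] * (k + 1)
    for _ in range(n_samples):
        # Standardized random offsets around the instance.
        z = [rng.gauss(0.0, 1.0) for _ in range(k)]
        sample = [x + s * zi for x, s, zi in zip(instance, scales, z)]
        # Exponential kernel: nearby samples count more.
        weight = math.exp(-sum(zi * zi for zi in z) / kernel_width ** 2)
        design = [1.0] + z
        y = predict(sample)
        # Accumulate the weighted normal equations X'WX and X'Wy.
        for i in range(k + 1):
            xtwy[i] += weight * design[i] * y
            for j in range(k + 1):
                xtwx[i][j] += weight * design[i] * design[j]
    coefs = solve_linear_system(xtwx, xtwy)
    return coefs[0], coefs[1:]


# Hypothetical non-linear black box over (age, years of experience).
def black_box(x):
    age, experience = x
    return 40000 + 20 * age ** 2 + 3000 * experience


intercept, slopes = local_surrogate(black_box, instance=[40.0, 5.0], scales=[5.0, 2.0])
print(round(intercept), [round(s) for s in slopes])
```

The slopes approximate the local effect of each feature near the chosen instance; for a different instance the quadratic age term would give a different slope, which is the point of a local explanation.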

What tools do we have?

Interpretable Models

Interpretable Model: Linear Regression

Interpretable Model: Decision Tree

Interpretable Model: Decision Rules

IF \(90m^2\leq \text{size} < 110m^2\) AND location \(=\) “good” THEN rent is between 1540 and 1890 EUR
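The rule above maps directly to code, which is part of what makes decision rules easy to read. A small sketch, with the hypothetical function name `rent_rule` and argument names chosen for illustration:

```python
def rent_rule(size_m2, location):
    """Return the predicted rent range in EUR, or None if the rule does not apply."""
    if 90 <= size_m2 < 110 and location == "good":
        return (1540, 1890)
    return None


print(rent_rule(100, "good"))  # rule applies
print(rent_rule(110, "good"))  # size is outside the half-open interval
```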

Model-specific methods

TODO: Example for CNNs

Model-specific methods

TODO: Example for text (RNNs and attention?)

Model-agnostic methods

TODO: PDP gif

Model-agnostic methods

TODO: Feature importance figure

Model-agnostic methods: Global Surrogate
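A minimal sketch of a global surrogate: train an interpretable model on the predictions of the black box (not on the true targets) and check how well it reproduces them. Here the surrogate is a one-split decision stump, and the `black_box` function and the experience grid are hypothetical.

```python
import math


def fit_stump(xs, ys):
    """Find the single threshold split that minimizes squared error."""
    best = None
    for threshold in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < threshold]
        right = [y for x, y in zip(xs, ys) if x >= threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((y - left_mean) ** 2 for y in left)
        sse += sum((y - right_mean) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best


# Hypothetical black box: salary rises smoothly with years of experience.
def black_box(experience):
    return 40000 + 20000 / (1 + math.exp(-(experience - 5)))


xs = list(range(11))
black_box_predictions = [black_box(x) for x in xs]

sse, threshold, low, high = fit_stump(xs, black_box_predictions)

# R-squared of the surrogate measured against the black-box predictions.
overall_mean = sum(black_box_predictions) / len(black_box_predictions)
total_ss = sum((y - overall_mean) ** 2 for y in black_box_predictions)
r_squared = 1 - sse / total_ss

print(f"IF experience < {threshold} THEN {low:.0f} ELSE {high:.0f} (R^2 = {r_squared:.2f})")
```

The R-squared value indicates how much of the black box's behavior the surrogate captures; a low value would mean the simple explanation should not be trusted.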

Model-agnostic methods: Local Surrogate

Example-based Methods

Example-focused Methods

TODO: Graphic for counterfactuals

Example-focused Methods

TODO: Graphic for prototypes

Interested in learning more?